A Lightweight Semantic Chunker Based on Tagging

نویسنده

  • Kadri Hacioglu
چکیده

In this paper, a framework for the development of a fast, accurate, and highly portable semantic chunker is introduced. The framework is based on a non-overlapping, shallow tree-structured language. The derivation of the tree is considered as a sequence of tagging actions in a predefined linguistic context, and a novel semantic chunker is accordingly developed. It groups the phrase chunks into the arguments of a given predicate in a bottom-up fashion. This is quite different from current approaches to semantic parsing or chunking that depend on full statistical syntactic parsers that require tree bank style annotation. We compare it with a recently proposed word-byword semantic chunker and present results that show that the phrase-by-phrase approach performs better than its word-by-word counterpart.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Chunker and Shallow Parser for Free Word Order Languages: An Approach based on Valency Theory and Feature Structures

Free word order languages have relatively unrestricted local word group or phrase structures that make the problem of chunking quite challenging. On the other hand, a robust chunker can drastically reduce the complexity of a parser that follows. We present here a computational framework for chunking of free word order languages based on a generalization of the valency theory. Every word has cer...

متن کامل

Semantic Role Labeling by Tagging Syntactic Chunks

In this paper, we present a semantic role labeler (or chunker) that groups syntactic chunks (i.e. base phrases) into the arguments of a predicate. This is accomplished by casting the semantic labeling as the classification of syntactic chunks (e.g. NP-chunk, PP-chunk) into one of several classes such as the beginning of an argument (B-ARG), inside an argument (I-ARG) and outside an argument (O)...

متن کامل

SEIMCHA: a new semantic image CAPTCHA using geometric transformations

As protection of web applications are getting more and more important every day, CAPTCHAs are facing booming attention both by users and designers. Nowadays, it is well accepted that using visual concepts enhance security and usability of CAPTCHAs. There exist few major different ideas for designing image CAPTCHAs. Some methods apply a set of modifications such as rotations to the original imag...

متن کامل

POS Tagger and Chunker for Tamil Language

This paper presents the Part Of Speech tagger and Chunker for Tamil using Machine learning techniques. Part Of Speech tagging and chunking are the fundamental processing steps for any language processing task. Part of speech (POS) tagging is the process of labeling automatic annotation of syntactic categories for each word in a corpus. Chunking is the task of identifying and segmenting the text...

متن کامل

Chinese Character-based Segmentation & POS-tagging and Named Entity Identification with a CRF Chunker

In this paper, we propose a character-based conditional random field (CRF) chunker to identify Chinese named entity words in the text files. The input for it is from a character-based tagger in which the segmentation and partof-speech (POS) tagging are conducted simultanueously. The character-based tagger is trained by using a corpus in which each character is tagged with both its position (POC...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004